RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check
Chinese Spelling Check (CSC) is the task of detecting and correcting
spelling errors in Chinese text. In practical applications, it is
important for CSC models to be able to correct errors across different
domains. In this paper, we propose a retrieval-augmented spelling check
framework called RSpell, which retrieves relevant domain terms and
incorporates them into CSC models. Specifically, we employ pinyin fuzzy
matching to search for terms, which are combined with the input and fed into
the CSC model. Then, we introduce an adaptive process control mechanism to
dynamically adjust the impact of external knowledge on the model. Additionally,
we develop an iterative strategy for the RSpell framework to enhance reasoning
capabilities. We conducted experiments on CSC datasets in three domains: law,
medicine, and official document writing. The results show that RSpell
achieves state-of-the-art performance in both zero-shot and fine-tuning
scenarios, demonstrating the effectiveness of the retrieval-augmented CSC
framework. Our code is available at https://github.com/47777777/Rspell
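The core retrieval step described above can be sketched as follows. This is a minimal, illustrative sketch only, not the released RSpell code: the tiny character-to-pinyin table, the toy domain lexicon, and the similarity threshold are all hypothetical, and a real system would use a full pinyin dictionary and a learned matcher.

```python
# Sketch of pinyin fuzzy retrieval for domain-adaptive CSC (illustrative only).
# PINYIN and DOMAIN_TERMS are hypothetical toy data, not from the paper.
from difflib import SequenceMatcher

PINYIN = {  # hypothetical toy character -> pinyin mapping
    "法": "fa", "律": "lv", "率": "lv", "医": "yi", "药": "yao",
}

DOMAIN_TERMS = ["法律", "医药"]  # hypothetical toy domain lexicon

def to_pinyin(text):
    """Convert a Chinese string to a space-joined pinyin sequence."""
    return " ".join(PINYIN.get(ch, ch) for ch in text)

def retrieve_terms(span, lexicon, threshold=0.6):
    """Return lexicon terms whose pinyin is similar to the input span's pinyin."""
    query = to_pinyin(span)
    hits = []
    for term in lexicon:
        score = SequenceMatcher(None, query, to_pinyin(term)).ratio()
        if score >= threshold:
            hits.append((term, score))
    return [t for t, _ in sorted(hits, key=lambda x: -x[1])]

def augment_input(sentence, span, lexicon):
    """Prepend retrieved terms to the sentence fed into the CSC model."""
    terms = retrieve_terms(span, lexicon)
    return ("[" + " ".join(terms) + "] " + sentence) if terms else sentence
```

For example, the misspelled span "法率" shares the pinyin "fa lv" with the domain term "法律", so the term is retrieved and prepended to the model input; the abstract's adaptive control mechanism would then decide how strongly this external knowledge influences the correction.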
Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments
Semantic role labeling (SRL) is a fundamental yet challenging task in the NLP
community. Recent work on SRL mainly falls into two lines: 1) BIO-based and
2) span-based approaches. Despite their ubiquity, both share an intrinsic
drawback: they do not consider internal argument structure, which can limit the
model's expressiveness. The key challenge is that arguments are flat
structures, with no determined subtree realizations for the words inside
them. To remedy
this, in this paper, we propose to regard flat argument spans as latent
subtrees, accordingly reducing SRL to a tree parsing task. In particular, we
equip our formulation with a novel span-constrained TreeCRF to make tree
structures span-aware and further extend it to the second-order case. We
conduct extensive experiments on CoNLL05 and CoNLL12 benchmarks. Results reveal
that our methods outperform all previous syntax-agnostic
works, achieving new state-of-the-art results under both the end-to-end and
w/ gold predicates settings.
Comment: COLING 202
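The latent-subtree idea above can be sketched in miniature. This is not the paper's span-constrained TreeCRF; it only illustrates the underlying principle that every word inside a flat argument span is a possible latent head, over which the model marginalizes. The per-word scores are hypothetical; a real parser would derive them from its arc scorer.

```python
# Sketch of marginalizing over latent heads of a flat argument span
# (illustrative only; not the paper's span-constrained TreeCRF).
import math

def head_marginals(scores, span):
    """Posterior probability of each word in span [i, j) being the latent head.

    scores[h] is a hypothetical log-score for word h heading the span;
    a trained model would produce these from biaffine arc scoring.
    """
    i, j = span
    log_z = math.log(sum(math.exp(s) for s in scores[i:j]))  # log-partition
    return [math.exp(s - log_z) for s in scores[i:j]]
```

With uniform scores every word in the span is an equally likely head; training then shifts probability mass toward linguistically plausible subtree realizations, which is what makes the flat spans "span-aware" in the tree parsing reduction.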